Detection method of domains generated by dictionary-based domain generation algorithm
ZHANG Yongbin, CHANG Wenxin, SUN Lianshan, ZHANG Hang
Journal of Computer Applications
2021, 41 (9): 2609-2614.
DOI: 10.11772/j.issn.1001-9081.2020111837
Domain names generated by a dictionary-based Domain Generation Algorithm (DGA) closely resemble benign domain names in composition, which makes them difficult to detect effectively with existing techniques. To solve this problem, a detection model named CL was proposed, combining a Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) network. The model consists of three parts: a character embedding layer, a feature extraction layer, and a fully connected layer. Firstly, the characters of the input domain name were encoded by the character embedding layer. Then, the features of the domain name were extracted by the feature extraction layer, in which the CNN and the LSTM were connected in series. The n-gram features of the domain name were extracted by the CNN, and the extracted results were sent to the LSTM to learn the context features between n-grams. Meanwhile, different combinations of CNNs and LSTMs were used to learn the features of n-grams with different lengths. Finally, the dictionary-based DGA domain names were classified and predicted by the fully connected layer according to the extracted features. Experimental results show that the proposed model achieves its best performance when the CNNs use convolution kernel sizes of 3 and 4. In the experiments on four dictionary-based DGA families, the accuracy of the CL model is 2.20% higher than that of the CNN model. Moreover, as the number of sample families increases, the CL model shows better stability.
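To make the input side of the pipeline concrete, the following is a minimal sketch of how a domain label could be turned into integer indices for a character embedding layer, and of the 3-gram and 4-gram windows that convolution kernels of sizes 3 and 4 would slide over. The alphabet, the padding index, the fixed length `MAX_LEN`, and the function names `encode_domain` and `ngram_windows` are assumptions chosen for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import string

# Hypothetical vocabulary: lowercase letters, digits, and hyphen.
# Index 0 is reserved for padding (and, in this sketch, unknown characters).
ALPHABET = string.ascii_lowercase + string.digits + "-"
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
MAX_LEN = 32  # assumed fixed input length, not taken from the paper


def encode_domain(label, max_len=MAX_LEN):
    """Map a domain label to a fixed-length list of integer indices."""
    indices = [CHAR_TO_INDEX.get(ch, 0) for ch in label.lower()[:max_len]]
    return indices + [0] * (max_len - len(indices))


def ngram_windows(label, n):
    """Return the character n-grams that a width-n convolution kernel slides over."""
    return [label[i:i + n] for i in range(len(label) - n + 1)]


# A made-up dictionary-style label built from two English words.
label = "tablechair"
encoded = encode_domain(label)
trigrams = ngram_windows(label, 3)
fourgrams = ngram_windows(label, 4)
```

In a full model, the index sequence would feed an embedding layer, each kernel size would produce one feature sequence over its n-gram windows, and each such sequence would be passed to an LSTM before the fully connected classifier.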